Effective pseudo-relevance feedback for language modeling in extractive speech summarization
Abstract
Extractive speech summarization, aiming to automatically select an indicative set of sentences from a spoken document so as to concisely represent the most important aspects of the document, has become an active area for research and experimentation. An emerging stream of work is to employ the language modeling (LM) framework along with the Kullback-Leibler divergence measure for extractive speech summarization, which can perform important sentence selection in an unsupervised manner and has shown preliminary success. This paper presents a continuation of such a general line of research and its main contribution is two-fold. First, by virtue of pseudo-relevance feedback, we explore several effective sentence modeling formulations to enhance the sentence models involved in the LM-based summarization framework. Second, the utilities of our summarization methods and several widely-used methods are analyzed and compared extensively, which demonstrates the effectiveness of our methods.
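The unsupervised sentence selection described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes unigram models, whitespace tokenization, Jelinek-Mercer smoothing against the document model, and ranking by KL(document || sentence), where a lower value means a more representative sentence. The function names and the `lam` parameter are hypothetical. The pseudo-relevance feedback formulations are not detailed in the abstract and are not reproduced here.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, background, lam=0.5):
    """Jelinek-Mercer smoothed unigram model over `vocab` (smoothing choice is an assumption)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {
        w: lam * (counts[w] / total if total else 0.0) + (1 - lam) * background[w]
        for w in vocab
    }


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


def rank_sentences(sentences, lam=0.5):
    """Return sentence indices ordered from lowest to highest KL(document || sentence)."""
    tokenized = [s.lower().split() for s in sentences]
    doc_tokens = [w for sent in tokenized for w in sent]
    vocab = set(doc_tokens)
    doc_counts = Counter(doc_tokens)
    # Document model: maximum-likelihood unigram estimate, also used as the
    # background for smoothing (an assumption made to keep the sketch self-contained).
    doc_model = {w: doc_counts[w] / len(doc_tokens) for w in vocab}
    scores = []
    for i, sent in enumerate(tokenized):
        sent_model = unigram_model(sent, vocab, doc_model, lam)
        scores.append((kl_divergence(doc_model, sent_model), i))
    return [i for _, i in sorted(scores)]


if __name__ == "__main__":
    toy_document = ["a b a b", "a b", "c"]
    print(rank_sentences(toy_document))
```

A summary would then be built by taking sentences from the front of the ranking until the target summarization ratio is reached. A pseudo-relevance feedback step would refine each sentence model before the divergence is computed.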
Similar resources
Positional language modeling for extractive broadcast news speech summarization
Extractive summarization, with the intention of automatically selecting a set of representative sentences from a text (or spoken) document so as to concisely express the most important theme of the document, has been an active area of experimentation and development. A recent trend of research is to employ the language modeling (LM) approach for important sentence selection, which has proven to...
An Empirical Comparison of Contemporary Unsupervised Approaches for Extractive Speech Summarization
With the rapid development of the Internet and the arrival of the big data era, automatic summarization has emerged as a popular research topic. The aim of automatic summarization is to select important text or spoken sentences to represent the topic (theme) of the original text or spoken document according to a predefined summarization ratio. In this study we frame automatic summari...
Leveraging Effective Query Modeling Techniques for Speech Recognition and Summarization
Statistical language modeling (LM) that purports to quantify the acceptability of a given piece of text has long been an interesting yet challenging research area. In particular, language modeling for information retrieval (IR) has enjoyed remarkable empirical success; one emerging stream of the LM approach for IR is to employ the pseudo-relevance feedback process to enhance the representation ...
Extractive speech summarization - from the view of decision theory
Extractive speech summarization can be thought of as a decision-making process where the summarizer attempts to select a subset of informative sentences from the original document. Meanwhile, a sentence being selected as part of a summary is typically determined by three primary factors: significance, relevance and redundancy. To meet these specifications, we recently presented a novel probabil...
Extract-biased pseudo-relevance feedback
Successfully retrieving a web document is a twofold problem: having an adequate query that can usefully and properly help filtering relevant documents from huge collections, and presenting the user those that will indeed fulfill his/her needs. In this paper, we focus on the first issue – the problem of having a misleading user query. The aim of the work is to refine a query by using extracts in...